Generalized Bose-Fermi statistics and structural correlations in weighted networks 
We derive a class of generalized statistics, unifying the Bose and Fermi ones, that describe any
system where the first-occupation energies or probabilities are different from subsequent ones, as
in the presence of thresholds, saturation, or aging. The statistics completely describe the structural
correlations of weighted networks, which turn out to be stronger than expected and to determine
significant topological biases. Our results show that the null behavior of weighted networks is
different from what was previously believed, and that a systematic redefinition of weighted properties
is necessary.

The Fermi-Dirac and Bose-Einstein distributions describe systems whose states can be discretely populated, at most once or an infinite number of times respectively. Although they were originally introduced to model quantum particles, they turn out to describe a wider range of systems, including traffic [1] and complex networks [2,3]. It is therefore not surprising that extensions of these distributions are indicated not only by quantum theory itself (e.g. anyons and supersymmetry), but also by other research fields where they are encountered [2]. In this Letter, starting from a problem arising in network theory, we derive a class of generalized statistics that unify the Bose-Einstein and Fermi-Dirac ones by extending them in two directions simultaneously: first, the maximum occupation number of a state is any integer between one and infinity; second, the first-occupation energies may be different from next-occupation ones. A natural application is to social networks, where establishing a link representing mutual acquaintance between two people is more costly than reinforcing an already existing link. Clearly, several systems are characterized by this mechanism, where an extra energy is initially required to overcome a threshold, or by the opposite one, where repeated occupations are energetically suppressed due to saturation or aging. Thus, even if we derive the statistics in the context of networks, they have a wider and more abstract range of application.

A network or graph is a set of $N$ vertices connected by $L$ links or edges. It is characterized by local properties such as the degree $k_i = \sum_j a_{ij}$ (the number of links emanating from vertex $i$, where $a_{ij} = 1$ if a link exists between $i$ and $j$, and $a_{ij} = 0$ otherwise), and by higher-order correlations, such as the dependence on $k_i$ of the average degree $k_i^{nn} = \sum_j a_{ij} k_j / k_i$ of $i$'s neighbors and of the clustering coefficient $c_i = \sum_{jk} a_{ij} a_{jk} a_{ki} / [k_i (k_i - 1)]$. In real unweighted networks, patterns that were first interpreted as nontrivial [4,5] are now understood as mere effects of the lower-level graph structure. For instance, in a random network where only the degree sequence $\{k_i\}_{i=1}^N$ is specified (the configuration model), the probability that the vertices $i$ and $j$ are connected was expected [6] to be

$$p_{ij} = x_i x_j \tag{1}$$

where $x_i = k_i / \sqrt{2L}$ and $L = \sum_i k_i / 2$. This implies that $\langle k_i \rangle = \sum_j p_{ij} = k_i$ [6], and that $\langle k_i^{nn} \rangle$ and $\langle c_i \rangle$ are independent of $k_i$ (where $\langle \cdots \rangle$ denotes an ensemble average). Deviations from these flat behaviors were interpreted as a signature of higher-order correlations [4,5]. However, it was later shown that, even for random networks with specified degrees, $k_i^{nn}$ and $c_i$ decrease with $k_i$ [7,8]. Indeed, eq.(1) is not the correct probability for large $k_i k_j$, since in this case $p_{ij} > 1$, corresponding to undesired multiple edges [8]. The constraint $p_{ij} \le 1$ can be enforced by fixing a structural cut-off $k_{max} \sim \sqrt{N}$ on the maximum degree [9]. However, this fails to reproduce real networks, such as the Internet [7], where $k_{max}$ far exceeds this value. Thus the local properties alone unavoidably determine higher-order 'structural correlations' [7,8].
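
The breakdown of eq.(1) for pairs of hubs is easy to see numerically. The following Python sketch evaluates $k_i k_j / 2L$ for a small degree sequence; the sequence itself (two hubs and 98 low-degree vertices) is a hypothetical example chosen for illustration, not data from this Letter.

```python
import math

def naive_link_probability(degrees, i, j):
    """Eq.(1): p_ij = x_i * x_j with x_i = k_i / sqrt(2L), where 2L = sum of all degrees."""
    two_L = sum(degrees)
    return degrees[i] * degrees[j] / two_L

# Hypothetical degree sequence: two hubs plus 98 vertices of degree 2.
degrees = [60, 50] + [2] * 98
N = len(degrees)

p_hubs = naive_link_probability(degrees, 0, 1)      # pair of hubs
p_leaves = naive_link_probability(degrees, 2, 3)    # pair of low-degree vertices
structural_cutoff = math.sqrt(N)                    # k_max ~ sqrt(N)

print(f"p(hub, hub)   = {p_hubs:.3f}")
print(f"p(leaf, leaf) = {p_leaves:.4f}")
print(f"structural cut-off ~ {structural_cutoff:.1f}, largest degree = {max(degrees)}")
```

For the hub pair the value exceeds one, so it cannot be a connection probability, while for low-degree pairs eq.(1) stays well below one. The largest degree here lies above the structural cut-off, which is the regime where the problem appears.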

Structural correlations can be studied analytically using exponential random graphs [3], representing the ensemble of maximally random networks with specified properties $\{\pi_i\}_i$, each governed by a control parameter $\theta_i$. A graph $G$ in the ensemble is assigned the probability $P(G) = e^{-H(G)}/Z$, where $H(G) = \sum_i \theta_i \pi_i(G)$ is the graph Hamiltonian and $Z = \sum_G \exp[-H(G)]$ is the partition function [3]. Any unweighted graph $G$ is fully specified by its adjacency matrix $A$, with entries $\{a_{ij}\}$. Thus for maximally random graphs with specified degrees [8], $H(A) = \sum_i \alpha_i k_i = \sum_{i<j} (\alpha_i + \alpha_j) a_{ij}$ and $P(A) = \prod_{i<j} p_{ij}^{a_{ij}} (1 - p_{ij})^{1 - a_{ij}}$, where $p_{ij}$ is the probability that $i$ and $j$ are linked:

$$p_{ij} = \frac{x_i x_j}{1 + x_i x_j} \tag{2}$$

where $x_i = e^{-\alpha_i}$ is no longer a function of $k_i$ alone. The above Fermi-Dirac distribution is the correct null form for $p_{ij}$ [8]. Since $p_{ij}$ now does not depend only on $k_i$ and $k_j$, higher-order effects are generated even if only local properties are fixed: unlike eq.(1), eq.(2) correctly predicts that $\langle k_i^{nn} \rangle$ and $\langle c_i \rangle$ decrease with $\langle k_i \rangle$ [8]. Thus purely uncorrelated unweighted networks do not exist.
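
The step from the degree-constrained Hamiltonian $H(A) = \sum_{i<j} (\alpha_i + \alpha_j) a_{ij}$ to the Fermi-Dirac form can be made explicit. The following is a sketch of the standard calculation, assuming only that each pair $i<j$ carries an independent binary variable $a_{ij} \in \{0, 1\}$:

```latex
\begin{align}
Z &= \sum_{A} e^{-H(A)}
   = \prod_{i<j} \sum_{a_{ij}=0}^{1} e^{-(\alpha_i+\alpha_j)\,a_{ij}}
   = \prod_{i<j} \left( 1 + x_i x_j \right),
   \qquad x_i \equiv e^{-\alpha_i}, \\
P(A) &= \frac{e^{-H(A)}}{Z}
   = \prod_{i<j} \frac{(x_i x_j)^{a_{ij}}}{1 + x_i x_j}
   = \prod_{i<j} p_{ij}^{a_{ij}} \left( 1 - p_{ij} \right)^{1-a_{ij}},
   \qquad p_{ij} = \frac{x_i x_j}{1 + x_i x_j}.
\end{align}
```

For $x_i x_j \ll 1$ this reduces to $p_{ij} \approx x_i x_j$, which is why the factorized form remains a reasonable approximation for pairs of low-degree vertices and fails only for pairs of hubs.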

While the unweighted case is well understood, weighted networks are more controversial. On one hand, since structural correlations are due to the 'fermionic' constraint disallowing multiple edges [8], they are unexpected for weighted graphs, where large weights $w_{ij}$ (equivalent to multiple edges [10]) are allowed. In particular, if $s_i = \sum_j w_{ij}$ denotes the strength of vertex $i$, random weighted networks with specified strength sequence $\{s_i\}_{i=1}^N$ (we denote this null model as model 3) are expected [11,12] to follow a weighted version of eq.(1):

$$\langle w_{ij} \rangle = y_i y_j \tag{3}$$

where $y_i = s_i / \sqrt{s_{tot}}$ and $s_{tot} = \sum_i s_i$. This restores the expected degree-independent behavior for the weighted analogues of $k_i^{nn}$ and $c_i$, defined as $k_i^w = \sum_j w_{ij} k_j / s_i$ (weighted average nearest neighbor degree, or affinity) and $c_i^w = \sum_{j,k} (w_{ij} + w_{ik}) a_{ij} a_{ik} a_{jk} / [2 s_i (k_i - 1)]$ (weighted clustering coefficient) respectively [13,14]. On the other hand, theoretical results [3,15] (that we confirm and extend later on) indicate that $\langle w_{ij} \rangle$ has a different form, even if the effects on network properties have never been studied. Another indicator of correlations is the disparity $Y_i = \sum_j w_{ij}^2 / s_i^2$ [14,16]. It is expected that $Y_i \approx 1/k_i$ if weights are equally distributed among $i$'s neighbors, and that a larger value of $Y_i$ signals an excess concentration of weight in one or more links [14,16]. Similarly, the modularity [10,17] (measuring whether the network is partitioned into communities) is expected to vanish for random weighted networks. However, the null behavior of both properties has never been studied systematically. The nonlinear dependence of $s_i$ on $k_i$ is interpreted as another indicator of correlations [13,14], since if the topology is kept fixed and the weights are globally reshuffled on it [13,14], then $\langle w_{ij} \rangle = \bar{w} a_{ij}$ (where $\bar{w}$ is the average non-zero weight in the network), implying $\langle s_i \rangle = \bar{w} k_i$. However, in this different null model (model 1) $\langle k_i^w \rangle$ and $\langle c_i^w \rangle$ equal their unweighted counterparts ($k_i^{nn}$ and $c_i$), and thus inherit any purely topological correlation [13]. One can partly remove these correlations by globally reshuffling the weights and simultaneously randomizing the topology in a degree-preserving way [18] (model 2). However, the unweighted structural correlations discussed above will still remain. The situation becomes even more intricate when both strengths and degrees are prescribed (model 4) [19,20]. This case is difficult to inspect without further assumptions. A first interesting result [19] is that it is impossible to decouple purely topological and weighted quantities to obtain completely independent local properties. However, the assumption of factorized marginal probabilities, leading to an expression analogous to eq.(3) where $y_i \propto s_i / k_i$, was made [19]. In what follows we go one step further and show that such constraints represent only a part of the problem. We find that the full structural correlations are remarkably stronger, and described in the most general case by mixed Bose-Fermi statistics.
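
To make the weighted quantities defined above concrete, the following Python sketch computes the degree, strength, affinity, weighted clustering coefficient and disparity of every vertex of a small weighted graph. The function name `weighted_measures` and the example matrix are illustrative assumptions; the matrix is assumed symmetric with a zero diagonal.

```python
def weighted_measures(W):
    """Return per-vertex lists (k, s, k_w, c_w, Y) for a symmetric weight matrix W.

    k   : degree               k_i   = sum_j a_ij, with a_ij = 1 if w_ij > 0
    s   : strength             s_i   = sum_j w_ij
    k_w : affinity             k_i^w = sum_j w_ij k_j / s_i
    c_w : weighted clustering  sum_{j,l} (w_ij + w_il) a_ij a_il a_jl / [2 s_i (k_i - 1)]
    Y   : disparity            Y_i   = sum_j w_ij^2 / s_i^2
    Entries that are undefined (s_i = 0, or k_i < 2 for c_w) are returned as None.
    """
    n = len(W)
    A = [[1 if W[i][j] > 0 else 0 for j in range(n)] for i in range(n)]
    k = [sum(A[i]) for i in range(n)]
    s = [sum(W[i]) for i in range(n)]
    k_w, c_w, Y = [], [], []
    for i in range(n):
        if s[i] == 0:
            k_w.append(None); c_w.append(None); Y.append(None)
            continue
        k_w.append(sum(W[i][j] * k[j] for j in range(n)) / s[i])
        Y.append(sum(W[i][j] ** 2 for j in range(n)) / s[i] ** 2)
        if k[i] < 2:
            c_w.append(None)
            continue
        closed = sum((W[i][j] + W[i][l]) * A[i][j] * A[i][l] * A[j][l]
                     for j in range(n) for l in range(n))
        c_w.append(closed / (2 * s[i] * (k[i] - 1)))
    return k, s, k_w, c_w, Y

# Hypothetical example: a triangle (0, 1, 2) plus a pendant vertex 3, all weights equal to 2.
W = [[0, 2, 2, 2],
     [2, 0, 2, 0],
     [2, 2, 0, 0],
     [2, 0, 0, 0]]
k, s, k_w, c_w, Y = weighted_measures(W)
for i in range(len(W)):
    print(i, k[i], s[i], k_w[i], c_w[i], Y[i], 1 / k[i])
```

With equal weights the disparity of each vertex equals $1/k_i$, and the affinity and weighted clustering coincide with their unweighted counterparts. Raising a single weight, for example setting the weight of the link between vertices 0 and 1 to 10, pushes $Y_0$ above $1/k_0$.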

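We look for the analytical solutions of the four null models in terms of the probability $q_{ij}(w)$ that $i$ and $j$ are joined by a link of weight $w$ (including $w = 0$ when no link is there). The probability $P(W)$ of a graph with weight matrix $W$ (having entries $w_{ij} \ge 0$) is

$$P(W) = \prod_{i<j} q_{ij}(w_{ij}) \tag{4}$$

Without loss of generality we assume integer weights, as in standard approaches [10,12,19]. Then $\sum_{w=0}^{w^*} q_{ij}(w) = 1$ $\forall i,j$, where $w^*$ is the maximum allowed weight. The entries of the adjacency matrix are $a_{ij} = \Theta(w_{ij})$, where $\Theta(x)$ is the Heaviside function (with $\Theta(0) = 0$). The probability $p_{ij}$ that $i$ and $j$ are connected by a link, irrespective of the weight of the latter, is $p_{ij} = \sum_{w>0} q_{ij}(w) = 1 - q_{ij}(0)$. All expectation values are completely specified by $q_{ij}(w)$:

$$\langle k_i \rangle = \sum_j p_{ij} = N - 1 - \sum_j q_{ij}(0)$$

$$\langle s_i \rangle = \sum_j \langle w_{ij} \rangle = \sum_j \sum_{w>0} w \, q_{ij}(w)$$

$$\langle k_i^w \rangle = \frac{\sum_j \langle w_{ij} k_j \rangle}{\langle s_i \rangle} = \frac{\sum_j \langle w_{ij} \rangle (\langle k_j \rangle + 1 - p_{ij})}{\langle s_i \rangle}$$

$$\langle c_i^w \rangle = \frac{\sum_{jk} \langle (w_{ij} + w_{ik}) a_{ij} a_{ik} a_{jk} \rangle}{2 \langle s_i (k_i - 1) \rangle} = \frac{\sum_{jk} \langle w_{ij} \rangle p_{ik} p_{jk}}{\langle s_i \rangle \langle k_i \rangle - \sum_j \langle w_{ij} \rangle p_{ij}}$$

$$\langle Y_i \rangle = \frac{\sum_j \langle w_{ij}^2 \rangle}{(\sum_k \langle w_{ik} \rangle)^2} = \frac{\sum_j \sum_{w>0} w^2 q_{ij}(w)}{[\sum_k \sum_{w>0} w \, q_{ik}(w)]^2}$$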
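
The factor $\langle k_j \rangle + 1 - p_{ij}$ in the expected affinity follows from the independence of different vertex pairs in eq.(4), together with the identity $w_{ij} a_{ij} = w_{ij}$. A short derivation sketch, in which expected values of ratios are approximated by ratios of expected values:

```latex
\begin{align}
\langle w_{ij} k_j \rangle
  &= \Big\langle w_{ij} \sum_{l \neq j} a_{jl} \Big\rangle
   = \langle w_{ij} a_{ij} \rangle
     + \sum_{l \neq i,j} \langle w_{ij} \rangle \langle a_{jl} \rangle \\
  &= \langle w_{ij} \rangle
     + \langle w_{ij} \rangle \big( \langle k_j \rangle - p_{ij} \big)
   = \langle w_{ij} \rangle \big( \langle k_j \rangle + 1 - p_{ij} \big).
\end{align}
```

The same argument applied to $\langle s_i (k_i - 1) \rangle$ gives $\langle s_i \rangle \langle k_i \rangle - \sum_j \langle w_{ij} \rangle p_{ij}$, which is the denominator of the expected weighted clustering coefficient.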

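We shall also consider the modularity later on. We now reformulate models 1-4 as exponential random graphs. In model 1, the whole topology (each entry of $a_{ij}$) is fixed. The only constraint on $w_{ij}$ is then $\Theta(w_{ij}) = a_{ij}$:

$$H_1(W) = \sum_{i<j} \alpha_{ij} a_{ij} = \sum_{i<j} \alpha_{ij} \Theta(w_{ij})$$

In models 2-4 the constraints are $\{k_i\}_{i=1}^N$ and/or $\{s_i\}_{i=1}^N$:

$$H_2(W) = \sum_i \alpha_i k_i = \sum_{i<j} (\alpha_i + \alpha_j) \Theta(w_{ij}) \tag{6}$$

$$H_3(W) = \sum_i \beta_i s_i = \sum_{i<j} (\beta_i + \beta_j) w_{ij} \tag{7}$$

$$H_4(W) = \sum_{i<j} [(\alpha_i + \alpha_j) \Theta(w_{ij}) + (\beta_i + \beta_j) w_{ij}] \tag{8}$$

We note that the above models are all particular cases of

$$H(W) = \sum_{i<j} [\alpha_{ij} \Theta(w_{ij}) + \beta_{ij} w_{ij}] \tag{9}$$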



The corresponding $P(W)$ can be expressed as follows:

$$P(W) = \frac{e^{-H(W)}}{\sum_{W'} e^{-H(W')}} = \prod_{i<j} \frac{e^{-\alpha_{ij} \Theta(w_{ij}) - \beta_{ij} w_{ij}}}{1 + e^{-\alpha_{ij}} \sum_{w=1}^{w^*} e^{-\beta_{ij} w}}$$

Thus, by comparison with eq.(4), we find analytically

$$q_{ij}(w) = \frac{x_{ij}^{\Theta(w)} y_{ij}^{w}}{1 + x_{ij} \sum_{w'=1}^{w^*} y_{ij}^{w'}} = \frac{x_{ij}^{\Theta(w)} y_{ij}^{w} (1 - y_{ij})}{1 - y_{ij} + x_{ij} y_{ij} - x_{ij} y_{ij}^{1+w^*}} \tag{10}$$

where we have set $x_{ij} = e^{-\alpha_{ij}}$ and $y_{ij} = e^{-\beta_{ij}}$. The above class of generalized statistics, interpolating between the Fermi-Dirac ($y_{ij} = 1$, $w^* = 1$) and Bose-Einstein ($x_{ij} = 1$ and $w^* = +\infty$) ones, is our main result. It applies to any system described by eq.(9), and represents the probability that its states are populated $w$ times. Even if multiple occupations are allowed (which is a property of bosons), the first occupation is necessarily binary (which is a property of fermions). Depending on the sign of $\alpha_{ij}$, the first occupation (whose energy is $\alpha_{ij} + \beta_{ij}$) is either favoured or suppressed with respect to all other occupations, whose energies are $\beta_{ij}$.
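
The generalized statistics of eq.(10) are straightforward to evaluate. The Python sketch below implements the occupation probability for a single state using the summed form of the normalization, so that $y = 1$ is allowed, and checks the two limiting cases numerically. The function names `q` and `link_probability` and all parameter values are assumptions made for illustration.

```python
def q(w, x, y, w_star):
    """Eq.(10): probability that a state is occupied w times (0 <= w <= w_star).

    x = exp(-alpha) weights only the first occupation, y = exp(-beta) every occupation.
    The normalization is 1 + x * sum_{v=1..w_star} y^v.
    """
    if not 0 <= w <= w_star:
        return 0.0
    norm = 1.0 + x * sum(y ** v for v in range(1, w_star + 1))
    first = x if w > 0 else 1.0          # the factor x^{Theta(w)}
    return first * y ** w / norm

def link_probability(x, y, w_star):
    """p = sum_{w>0} q(w) = 1 - q(0)."""
    return 1.0 - q(0, x, y, w_star)

# Normalization for arbitrary (hypothetical) parameters.
print(sum(q(w, 0.3, 0.8, 7) for w in range(8)))

# Fermi-Dirac limit: y = 1 and w_star = 1 give q(1) = x / (1 + x).
print(q(1, 2.0, 1.0, 1), 2.0 / (1.0 + 2.0))

# Bose-Einstein limit: x = 1 and large w_star give q(w) close to y^w (1 - y).
print(q(3, 1.0, 0.5, 200), 0.5 ** 3 * (1 - 0.5))
```

A finite but large `w_star` stands in for $w^* = +\infty$ here, which is adequate as long as $y < 1$ so that the geometric tail is negligible.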

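We now turn to the four models separately. Models 1 ($y_{ij} = 1$) and 2 (for which additionally $x_{ij} = x_i x_j$) yield

$$q_{ij}(w) = \frac{x_{ij}^{\Theta(w)}}{1 + w^* x_{ij}}, \qquad p_{ij} = \frac{w^* x_{ij}}{1 + w^* x_{ij}} \tag{11}$$

Since $\bar{w} = \sum_{w=1}^{w^*} w / w^*$ and $\overline{w^2} = \sum_{w=1}^{w^*} w^2 / w^*$, we have

$$\langle w_{ij} \rangle = \sum_{w=1}^{w^*} \frac{w \, x_{ij}}{1 + w^* x_{ij}} = \bar{w} \, p_{ij} \;\Rightarrow\; \langle s_i \rangle = \bar{w} \langle k_i \rangle \tag{12}$$

$$\langle k_i^w \rangle = \frac{\bar{w} \sum_j p_{ij} (\langle k_j \rangle + 1 - p_{ij})}{\bar{w} \langle k_i \rangle} = \langle k_i^{nn} \rangle \tag{13}$$

$$\langle c_i^w \rangle = \frac{\bar{w} \sum_{jk} p_{ij} p_{ik} p_{jk}}{\bar{w} \langle k_i (k_i - 1) \rangle} = \langle c_i \rangle \tag{14}$$

$$\langle Y_i \rangle = \frac{\overline{w^2} \sum_j p_{ij}}{\bar{w}^2 \langle k_i \rangle^2} = \frac{\overline{w^2}}{\bar{w}^2 \langle k_i \rangle} \tag{15}$$

Equations (12-14) show that $\langle s_i \rangle \propto k_i$ and that $\langle k_i^w \rangle$ and $\langle c_i^w \rangle$ inherit from $\langle k_i^{nn} \rangle$ and $\langle c_i \rangle$ any dependence on $k_i$, thus conveying information only relative to their unweighted counterparts. Moreover, eq.(15) implies $\langle Y_i \rangle \gg 1 / \langle k_i \rangle$, since $\overline{w^2} / \bar{w}^2 \gg 1$ for real networks with broadly distributed weights [13,14]. However, this reflects the overall weight distribution and does not indicate a local weight imbalance, as usually interpreted [14,16]. These problems arise due to purely fermionic correlations.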
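
The disparity result of eq.(15) can be checked with a few lines of Python. In models 1 and 2 every non-zero weight between 1 and $w^*$ is equally likely, so $\bar{w}$ and $\overline{w^2}$ are moments of a uniform distribution. The sketch below uses a hypothetical row of connection probabilities; the function names are illustrative assumptions.

```python
def weight_moments(w_star):
    """Return (w_bar, w2_bar): mean and mean square of the weights 1..w_star, taken as equally likely."""
    weights = range(1, w_star + 1)
    w_bar = sum(weights) / w_star
    w2_bar = sum(w * w for w in weights) / w_star
    return w_bar, w2_bar

def expected_disparity(p_row, w_star):
    """Eq.(15): <Y_i> = w2_bar * sum_j p_ij / (w_bar * sum_j p_ij)^2."""
    w_bar, w2_bar = weight_moments(w_star)
    k_exp = sum(p_row)                   # expected degree <k_i>
    return w2_bar * k_exp / (w_bar * k_exp) ** 2

# Hypothetical vertex with 20 potential neighbours, each linked with probability 0.5.
p_row = [0.5] * 20
for w_star in (1, 10, 100):
    Y = expected_disparity(p_row, w_star)
    print(f"w* = {w_star:3d}   <Y_i> = {Y:.4f}   1/<k_i> = {1 / sum(p_row):.4f}")
```

Only for $w^* = 1$ does $\langle Y_i \rangle$ equal $1/\langle k_i \rangle$. For larger $w^*$ the expected disparity exceeds $1/\langle k_i \rangle$ even though no vertex concentrates weight on particular links.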

We now consider model 3 ($x_{ij} = 1$, $y_{ij} = y_i y_j$):

$$q_{ij}(w) = \frac{(y_i y_j)^w (1 - y_i y_j)}{1 - (y_i y_j)^{w^*+1}}, \qquad p_{ij} = \frac{y_i y_j - (y_i y_j)^{w^*+1}}{1 - (y_i y_j)^{w^*+1}}$$

All the expected properties can again be computed analytically. Their behavior is well revealed by the ratio

$$\frac{\langle w_{ij} \rangle}{p_{ij}} = \frac{1}{1 - y_i y_j} - \frac{w^* (y_i y_j)^{w^*}}{1 - (y_i y_j)^{w^*}} \tag{16}$$

which is plotted in fig.1. Note that $\langle s_i \rangle = \bar{w} \langle k_i \rangle$ is no longer a correct prediction, since the expectation $\langle w_{ij} \rangle / p_{ij} = \bar{w}$ is never realised. Similarly, eq.(3) does not hold. All quantities can be calculated for any value of $w^*$. For brevity, we only report the case $w^* = +\infty$:

FIG. 1. Ratio $\langle w_{ij} \rangle / p_{ij}$ as a function of $y_i y_j$ for models 3 and 4. The values of $\bar{w}$, obtained for $y_i y_j = 1$ (models 1 and 2), are highlighted as larger points.

[Eqs. (17)-(21), giving the expected values for $w^* = +\infty$, are missing from the source text.]

where $\bar{y} = \sum_i y_i / N$. As clear from eq.(17), $y_i \propto \langle k_i \rangle$ and therefore all the above quantities display a nontrivial dependence on the degree. For instance, we can study scale-free networks by considering a power-law distribution $p(y) \propto y^{-\gamma}$ for the $y_i$'s (implying $P(k) \propto k^{-\gamma}$), and approximating the discrete sums with (analytically solvable) integrals: $\sum_j \to N \int dy \, p(y)$. The resulting curves, shown in fig. 2, strongly contradict the expectations [13,14,16] that one should observe $s_i \propto k_i$, and all other curves flat. While fermionic correlations yield disassortative trends, bosonic correlations generate assortative patterns. These constraints have a deeper origin than those studied in [19], where the unavoidable dependence of weights on connectivity was considered. Here we find that even if only the strengths (and not the degrees) are fixed, then $\langle w_{ij} \rangle$ does not factorize.
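
The ratio of eq.(16) and the resulting nonlinearity between strength and degree can be explored numerically. The Python sketch below compares the closed form of eq.(16) with a direct sum over weights, and then uses the $w^* = +\infty$ limit of model 3, in which $q_{ij}(w) = (y_i y_j)^w (1 - y_i y_j)$ so that $p_{ij} = y_i y_j$ and $\langle w_{ij} \rangle = y_i y_j / (1 - y_i y_j)$. The function names and the list of $y_i$ values are hypothetical choices for illustration.

```python
def ratio_eq16(y, w_star):
    """Eq.(16): <w_ij> / p_ij as a function of y = y_i * y_j, for 0 < y < 1."""
    return 1.0 / (1.0 - y) - w_star * y ** w_star / (1.0 - y ** w_star)

def ratio_direct(y, w_star):
    """Same ratio from the definition: sum_w w y^w / sum_{w>0} y^w (the normalization cancels)."""
    num = sum(w * y ** w for w in range(1, w_star + 1))
    den = sum(y ** w for w in range(1, w_star + 1))
    return num / den

def expected_degree_and_strength(ys, i):
    """Model 3 with w* = +infinity: p_ij = y_i y_j and <w_ij> = y_i y_j / (1 - y_i y_j)."""
    k = sum(ys[i] * yj for j, yj in enumerate(ys) if j != i)
    s = sum(ys[i] * yj / (1.0 - ys[i] * yj) for j, yj in enumerate(ys) if j != i)
    return k, s

print(ratio_eq16(0.6, 5), ratio_direct(0.6, 5))

# Hypothetical hidden variables y_i = 0.1, 0.2, ..., 0.9 (all pairwise products below 1).
ys = [0.1 * (m + 1) for m in range(9)]
for i in (0, 4, 8):
    k, s = expected_degree_and_strength(ys, i)
    print(f"y_i = {ys[i]:.1f}   <k_i> = {k:.3f}   <s_i> = {s:.3f}   <s_i>/<k_i> = {s / k:.3f}")
```

The ratio $\langle s_i \rangle / \langle k_i \rangle$ grows with $y_i$, so strength is not proportional to degree even though only the strengths are constrained in this model.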

Finally, in model 4 ($x_{ij} = x_i x_j$, $y_{ij} = y_i y_j$) the ratio $\langle w_{ij} \rangle / p_{ij}$ is still given by eq.(16). Now $q_{ij}(w)$ reads

$$q_{ij}(w) = \frac{(x_i x_j)^{\Theta(w)} (y_i y_j)^w (1 - y_i y_j)}{1 - y_i y_j + x_i x_j y_i y_j - x_i x_j (y_i y_j)^{1+w^*}} \tag{22}$$

Here one sees that, even if $x_i$ and $y_i$ are chosen as statistically independent, the resulting weighted and purely topological quantities are not independent of each other. This is the effect studied in [19] that we automatically recover here. However, eq.(22) also takes into account both bosonic and fermionic constraints. Therefore, unlike [19], here we do not need to restrict ourselves to sparse networks. We conclude that for models 3 and 4 the available weighted measures are uninformative, either in an absolute or in a relative sense. Thus a systematic redefinition of weighted network properties is necessary.

FIG. 2. Analytical results for random networks with strength sequence generated by the distribution $p(y) \propto y^{-\gamma}$ with (from top to bottom) $\gamma = 1, 2, 3, 4$. All networks have the same link density and $N = 10000$ vertices.

Structural correlations also affect the modularity of a partition of a network into communities, defined (up to a normalization constant) as

$$Q \propto \sum_{i<j} \left[ a_{ij} - \frac{k_i k_j}{2L} \right] c_{ij}, \qquad Q \propto \sum_{i<j} \left[ w_{ij} - \frac{s_i s_j}{s_{tot}} \right] c_{ij}$$

in the unweighted and weighted case respectively [10], where $c_{ij} = 1$ if $i$ and $j$ belong to the same community, and $c_{ij} = 0$ otherwise. For a non-modular network with only local constraints, one expects $\langle Q \rangle = 0$ since the differences in the square brackets are expected to vanish according to eqs.(1) and (3). However, we have shown that these expectations are wrong. Thus $\langle Q \rangle \ne 0$ even for random graphs, which means that the modularity of real networks is unavoidably biased and does not entirely represent a signature of community structure. Interestingly, the reverse operation, i.e. randomizing an unweighted network keeping both the modularity and the degree sequence fixed, has been shown [17] to reproduce most of the observed degree-degree correlations.

Our formalism treats null models in a unified fashion, but it clearly cannot indicate a priori the most appropriate null model for a specific network. Nonetheless, the identification of the Hamiltonian corresponding to each model allows deep insights into network structure. For instance, we can interpret in a new light the results [19,20] showing that some real networks, such as the US airport network and the World Trade Web, remain almost unchanged after randomizations that preserve both strengths and degrees. Indeed, for such networks the establishment of a link for the first time requires an extra cost (a transportation channel and/or a trade agreement), while on already existing links any further interaction is facilitated. In general, for these and other systems (including our initial example of social networks) eq.(9) may be already a good model, not simply a null one. If this is the case, the Bose-Fermi statistics in eq.(10) will naturally describe such real systems.
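
The threshold mechanism described above can be illustrated for a single pair of vertices. The Python sketch below varies the first-occupation energy $\alpha$ at fixed $\beta$ and reports the link probability and the mean weight of an existing link under eq.(10). The function name `occupation_stats` and the values of $\alpha$, $\beta$ and $w^*$ are hypothetical and not fitted to any real network.

```python
import math

def occupation_stats(alpha, beta, w_star):
    """Return (p, mean weight given a link) for one pair of vertices under eq.(10).

    alpha: extra energy of the first occupation (alpha > 0 acts as a threshold),
    beta : energy of every unit of weight.
    """
    x, y = math.exp(-alpha), math.exp(-beta)
    tail = [y ** w for w in range(1, w_star + 1)]       # y^w for w = 1..w_star
    norm = 1.0 + x * sum(tail)
    p = x * sum(tail) / norm                            # probability that a link exists
    mean_w = x * sum(w * t for w, t in enumerate(tail, start=1)) / norm
    return p, mean_w / p

for alpha in (-2.0, 0.0, 2.0, 4.0):
    p, w_link = occupation_stats(alpha, beta=0.2, w_star=50)
    print(f"alpha = {alpha:+.1f}   p = {p:.3f}   <w | link> = {w_link:.3f}")
```

A larger threshold $\alpha$ makes the link less likely to exist, while the mean weight of an existing link depends only on $\beta$ and $w^*$. This is the same cancellation that makes eq.(16) hold for both models 3 and 4.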